feat: add CacheReadInputTokens and CacheWriteInputTokens to TokenUsageRecord (#229)
Conversation
ssncferreira
left a comment
LGTM, just a few questions regarding the corresponding coder changes 🦷
```go
MsgID:                completion.ID,
Input:                calculateActualInputTokenUsage(lastUsage),
Output:               lastUsage.CompletionTokens,
CacheReadInputTokens: lastUsage.PromptTokensDetails.CachedTokens,
```
No CacheWriteInputTokens?
OpenAI doesn't provide this information.
```go
"cache_creation_input": resp.Usage.CacheCreationInputTokens,
"cache_read_input":     resp.Usage.CacheReadInputTokens,
```
I'm guessing this will have a corresponding PR on the coder side, right? Because IIRC this is being used in the session feature to sum up the tokens. cc @dannykopping
```go
require.Equal(t, 11904.0, promtest.ToFloat64(m.TokenUseCount.WithLabelValues(config.ProviderOpenAI, "gpt-4.1", "input_cached", defaultActorID, clientLabel)))
require.Equal(t, 0.0, promtest.ToFloat64(m.TokenUseCount.WithLabelValues(config.ProviderOpenAI, "gpt-4.1", "output_reasoning", defaultActorID, clientLabel)))
require.Equal(t, 12077.0, promtest.ToFloat64(m.TokenUseCount.WithLabelValues(config.ProviderOpenAI, "gpt-4.1", "total_tokens", defaultActorID, clientLabel)))
```
Do we have any grafana dashboards that use these labels?
Good question. I've consolidated the cached-token metric labels the same way it's done in the DB records, but maybe I should keep both. My intuition was to keep parity between DB records and metric records, but metrics can't be updated the way DB records can be migrated.
If the dashboards aren't used much, maybe it would be better to just change the labels and move fast? I'll check whether those labels are currently in use.
Reverted the removal of cached tokens from the `ExtraTokenTypes` field.
I've created a follow-up issue to remove them later: #243
The transition from the old format to the new one will be easier if one or two versions populate both.
coder/aibridge@519b082...a011104

Includes coder/aibridge#242 and coder/aibridge#229

Signed-off-by: Danny Kopping <danny@coder.com>
dannykopping
left a comment
Post-merge approval ✅
Thanks for adding a follow-up as well!
Two new columns added to `aibridge_token_usages`:
- `cache_read_input_tokens` (BIGINT, default 0)
- `cache_write_input_tokens` (BIGINT, default 0)

The migration backfills existing rows by extracting values from the `metadata` JSONB column: `cache_read_input`, `input_cached`, or `prompt_cached` for reads (the max value is selected, since only one should be set), and `cache_creation_input` for writes.

All references to this data in `metadata` were updated to reference the new columns. No changes other than where the data is extracted from.

Requires an aibridge library version bump to include: coder/aibridge#229

Fixes: coder/aibridge#150

Added cache read and cache write input tokens as first-class values:
- Added `CacheReadInputTokens` and `CacheWriteInputTokens` fields to the `TokenUsageRecord` struct.
- Removed those values from the `ExtraTokenTypes` field.

Part 1 of: #150